 homogeneous space


Latent SDEs on Homogeneous Spaces

Neural Information Processing Systems

We consider the problem of variational Bayesian inference in a latent variable model where a (possibly complex) observed stochastic process is governed by the unobserved solution of a latent stochastic differential equation (SDE). Motivated by the challenges that arise when trying to learn a latent SDE in $\mathbb{R}^n$ from large-scale data, such as efficient gradient computation, we take a step back and study a specific subclass instead. In our case, the SDE evolves inside a homogeneous latent space and is induced by stochastic dynamics of the corresponding (matrix) Lie group. In the context of learning problems, SDEs on the $n$-dimensional unit sphere are arguably the most relevant incarnation of this setup. For variational inference, the sphere not only facilitates using a uniform prior on the initial state of the SDE, but we also obtain a particularly simple and intuitive expression for the KL divergence between the approximate posterior and prior process in the evidence lower bound. We provide empirical evidence that a latent SDE of the proposed type can be learned efficiently by means of an existing one-step geometric Euler-Maruyama scheme. Despite restricting ourselves to a less diverse class of SDEs, we achieve competitive or even state-of-the-art performance on a collection of time series interpolation and classification benchmarks.
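As a rough illustration of the one-step geometric Euler-Maruyama idea mentioned in the abstract, the following minimal sketch simulates a path on the unit sphere in $\mathbb{R}^3$ by applying a random rotation at each step. It is not the authors' implementation: the constant drift vector `drift`, the scalar noise scale `sigma`, and the function names are assumptions made for this example, and the matrix exponential of the Lie algebra increment is evaluated with Rodrigues' rotation formula, which is specific to rotations in three dimensions.

```python
import math
import random


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotate(w, v):
    """Apply the rotation exp(hat(w)) to v via Rodrigues' formula.

    w is an axis-angle vector: its direction is the rotation axis and
    its length is the rotation angle.
    """
    theta = math.sqrt(sum(c * c for c in w))
    if theta < 1e-12:
        return list(v)  # rotation is numerically the identity
    k = [c / theta for c in w]
    kxv = cross(k, v)
    kdotv = sum(k[i] * v[i] for i in range(3))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [v[i] * cos_t + kxv[i] * sin_t + k[i] * kdotv * (1.0 - cos_t)
            for i in range(3)]


def geometric_em_step(z, drift, sigma, dt, rng):
    """One geometric Euler-Maruyama step on the sphere.

    The increment lives in the Lie algebra so(3), identified with R^3:
    a deterministic part drift * dt plus Gaussian noise of scale
    sigma * sqrt(dt) (both hypothetical, constant in time here).
    """
    w = [drift[i] * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
         for i in range(3)]
    return rotate(w, z)


def simulate(z0, drift, sigma, dt, n_steps, seed=0):
    """Return the list of states z_0, ..., z_{n_steps}."""
    rng = random.Random(seed)
    path = [list(z0)]
    for _ in range(n_steps):
        path.append(geometric_em_step(path[-1], drift, sigma, dt, rng))
    return path


if __name__ == "__main__":
    path = simulate([0.0, 0.0, 1.0], [0.0, 0.0, 0.5], 0.3, 0.01, 200)
    norms = [math.sqrt(sum(c * c for c in z)) for z in path]
    print(min(norms), max(norms))
```

Because every update is a rotation, each state stays on the sphere up to floating-point error, so no projection step is needed. This is the practical appeal of evolving the latent state through the group action instead of discretizing in the ambient space.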


A General Theory of Equivariant CNNs on Homogeneous Spaces

Neural Information Processing Systems

We present a general theory of Group equivariant Convolutional Neural Networks (G-CNNs) on homogeneous spaces such as Euclidean space and the sphere. Feature maps in these networks represent fields on a homogeneous base space, and layers are equivariant maps between spaces of fields. The theory enables a systematic classification of all existing G-CNNs in terms of their symmetry group, base space, and field type. We also answer a fundamental question: what is the most general kind of equivariant linear map between feature spaces (fields) of given types? We show that such maps correspond one-to-one with generalized convolutions with an equivariant kernel, and characterize the space of such kernels.
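The simplest instance of the statement that equivariant linear maps are convolutions can be checked numerically. The sketch below is an illustrative toy, not the paper's general construction: it assumes the cyclic group $\mathbb{Z}_n$ acting on itself by shifts and scalar-valued feature maps, and the function names are made up for this example. It verifies that shifting the input and then convolving gives the same result as convolving and then shifting.

```python
def cyclic_conv(f, kernel):
    """Group convolution on Z_n: (kernel * f)(x) = sum_y kernel(x - y) f(y)."""
    n = len(f)
    return [sum(kernel[(x - y) % n] * f[y] for y in range(n))
            for x in range(n)]


def shift(f, s):
    """Action of the group element s on a feature map: (s . f)(x) = f(x - s)."""
    n = len(f)
    return [f[(x - s) % n] for x in range(n)]


def is_equivariant(f, kernel):
    """Check conv(shift(f)) == shift(conv(f)) for every group element."""
    n = len(f)
    return all(cyclic_conv(shift(f, s), kernel) == shift(cyclic_conv(f, kernel), s)
               for s in range(n))


if __name__ == "__main__":
    f = [1, 2, 0, -1, 3]
    kernel = [2, -1, 0, 0, 1]
    print(is_equivariant(f, kernel))
```

Integer inputs keep the comparison exact. For non-trivial stabilizer subgroups and vector-valued fields, the kernel additionally has to satisfy an equivariance constraint, which this scalar toy case does not exercise.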


Equivariant non-linear maps for neural networks on homogeneous spaces

Nyholm, Elias, Carlsson, Oscar, Weiler, Maurice, Persson, Daniel

arXiv.org Machine Learning

This paper presents a novel framework for non-linear equivariant neural network layers on homogeneous spaces. The seminal work of Cohen et al. on equivariant $G$-CNNs on homogeneous spaces characterized the representation theory of such layers in the linear setting, finding that they are given by convolutions with kernels satisfying so-called steerability constraints. Motivated by the empirical success of non-linear layers, such as self-attention or input-dependent kernels, we set out to generalize these insights to the non-linear setting. We derive generalized steerability constraints that any such layer needs to satisfy and prove the universality of our construction. The insights gained into the symmetry-constrained functional dependence of equivariant operators on feature maps and group elements inform the design of future equivariant neural network layers. We demonstrate how several common equivariant network architectures ($G$-CNNs, implicit steerable kernel networks, conventional and relative-position-embedded attention-based transformers, and LieTransformers) may be derived from our framework.


Reviews: A General Theory of Equivariant CNNs on Homogeneous Spaces

Neural Information Processing Systems

I would give an accept score if I were able to look at the new version and be happy with it (as is possible in OpenReview settings, for example). However, since improving the presentation usually takes a lot of work and it is not possible for me to verify how the improvements have actually been implemented, I will bump it to a 5. I do think readability and clarity are key for impact, as written in my review, which is the main reason I gave a much lower score than other reviewers, some of whom have worked on exactly this intersection of algebra and G-CNNs themselves and provided valuable feedback on the content from an expert's perspective. The following comments are based on the reviewer's personal definition of clarity and good quality of presentation: that most of the time, when following the paper from start to end, it is clear to the reader why each paragraph is written and how it links to the objective of the main results of the paper, here claimed, e.g. in the last sentence, to be the development of new equivariant network architectures. The paper is one long lead-up of three pages of definitions of mathematical terms and symbols to the theorems in section 6 on equivariant kernels, which represent the core results of the paper. In general, I appreciate rigorous frameworks which generalize existing methods, especially if they provide insight and enable the design of an arbitrary new instance that fits in the framework (in this case transformations on arbitrary fields).